{ "cells": [
{ "cell_type": "markdown", "id": "db4d7066", "metadata": {}, "source": [ "# Creating a Custom Data Source\n", "\n", "**PyBroker** comes with pre-built [DataSources](https://www.pybroker.com/en/latest/reference/pybroker.data.html#pybroker.data.DataSource) for [Yahoo Finance](https://www.pybroker.com/en/latest/reference/pybroker.data.html#pybroker.data.YFinance), [Alpaca](https://www.pybroker.com/en/latest/reference/pybroker.data.html#pybroker.data.Alpaca), and [AKShare](https://github.com/akfamily/akshare), which you can use without any additional setup. If you need a different data source, **PyBroker** also lets you create your own ```DataSource``` class.\n", "\n", "\n", "## Extending DataSource\n", "\n", "The example below implements a new ```DataSource``` called ```CSVDataSource``` that loads data from a CSV file. ```CSVDataSource``` reads the file ```data/prices.csv``` into a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) and returns the rows that match the requested symbols and date range:" ] },
{ "cell_type": "code", "execution_count": 1, "id": "f7e59aa2", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import pybroker\n", "from pybroker.data import DataSource\n", "\n", "class CSVDataSource(DataSource):\n", "\n", "    def __init__(self):\n", "        super().__init__()\n", "        # Register custom columns in the CSV.\n", "        pybroker.register_columns('rsi')\n", "\n", "    def _fetch_data(self, symbols, start_date, end_date, _timeframe, _adjust):\n", "        df = pd.read_csv('data/prices.csv')\n", "        # Keep only the requested symbols.\n", "        df = df[df['symbol'].isin(symbols)]\n", "        df['date'] = pd.to_datetime(df['date'])\n", "        # Keep only rows inside the requested date range (inclusive).\n", "        return df[(df['date'] >= start_date) & (df['date'] <= end_date)]" ] },
{ "cell_type": "markdown", "id": "a3abf367", "metadata": {}, "source": [ "To make the custom ```'rsi'``` column from the CSV file available to **PyBroker**, we register it using [pybroker.register_columns](https://www.pybroker.com/en/latest/reference/pybroker.scope.html#pybroker.scope.register_columns). This allows **PyBroker** to use the custom column when it processes the data.\n", "\n", "The data returned by a custom ```DataSource``` must include the following columns, which **PyBroker** expects: ```symbol```, ```date```, ```open```, ```high```, ```low```, and ```close```.\n", "\n", "Now we can query the CSV data from an instance of ```CSVDataSource```:" ] },
{ "cell_type": "code", "execution_count": 2, "id": "08a7e845", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loading bar data...\n", "Loaded bar data: 0:00:00 \n", "\n" ] }, { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>date</th><th>symbol</th><th>open</th><th>high</th><th>low</th><th>close</th><th>rsi</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>2021-06-01</td><td>DIS</td><td>180.179993</td><td>181.009995</td><td>178.740005</td><td>178.839996</td><td>46.321532</td></tr>\n",
"    <tr><th>1</th><td>2021-06-01</td><td>MCD</td><td>235.979996</td><td>235.990005</td><td>232.740005</td><td>233.240005</td><td>46.522926</td></tr>\n",
"    <tr><th>2</th><td>2021-06-01</td><td>NKE</td><td>137.850006</td><td>138.050003</td><td>134.210007</td><td>134.509995</td><td>53.308085</td></tr>\n",
"    <tr><th>3</th><td>2021-06-02</td><td>DIS</td><td>179.039993</td><td>179.100006</td><td>176.929993</td><td>177.000000</td><td>42.635256</td></tr>\n",
"    <tr><th>4</th><td>2021-06-02</td><td>MCD</td><td>233.970001</td><td>234.330002</td><td>232.809998</td><td>233.779999</td><td>48.051484</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>382</th><td>2021-11-30</td><td>MCD</td><td>247.380005</td><td>247.899994</td><td>243.949997</td><td>244.600006</td><td>40.461178</td></tr>\n",
"    <tr><th>383</th><td>2021-11-30</td><td>NKE</td><td>168.789993</td><td>171.550003</td><td>167.529999</td><td>169.240005</td><td>51.505558</td></tr>\n",
"    <tr><th>384</th><td>2021-12-01</td><td>DIS</td><td>146.699997</td><td>148.369995</td><td>142.039993</td><td>142.149994</td><td>16.677555</td></tr>\n",
"    <tr><th>385</th><td>2021-12-01</td><td>MCD</td><td>245.759995</td><td>250.899994</td><td>244.110001</td><td>244.179993</td><td>39.853689</td></tr>\n",
"    <tr><th>386</th><td>2021-12-01</td><td>NKE</td><td>170.889999</td><td>173.369995</td><td>166.679993</td><td>166.699997</td><td>46.704527</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>387 rows × 7 columns</p>\n",
"</div>
" ], "text/plain": [ " date symbol open high low close \\\n", "0 2021-06-01 DIS 180.179993 181.009995 178.740005 178.839996 \n", "1 2021-06-01 MCD 235.979996 235.990005 232.740005 233.240005 \n", "2 2021-06-01 NKE 137.850006 138.050003 134.210007 134.509995 \n", "3 2021-06-02 DIS 179.039993 179.100006 176.929993 177.000000 \n", "4 2021-06-02 MCD 233.970001 234.330002 232.809998 233.779999 \n", ".. ... ... ... ... ... ... \n", "382 2021-11-30 MCD 247.380005 247.899994 243.949997 244.600006 \n", "383 2021-11-30 NKE 168.789993 171.550003 167.529999 169.240005 \n", "384 2021-12-01 DIS 146.699997 148.369995 142.039993 142.149994 \n", "385 2021-12-01 MCD 245.759995 250.899994 244.110001 244.179993 \n", "386 2021-12-01 NKE 170.889999 173.369995 166.679993 166.699997 \n", "\n", " rsi \n", "0 46.321532 \n", "1 46.522926 \n", "2 53.308085 \n", "3 42.635256 \n", "4 48.051484 \n", ".. ... \n", "382 40.461178 \n", "383 51.505558 \n", "384 16.677555 \n", "385 39.853689 \n", "386 46.704527 \n", "\n", "[387 rows x 7 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "csv_data_source = CSVDataSource()\n", "df = csv_data_source.query(['MCD', 'NKE', 'DIS'], '6/1/2021', '12/1/2021')\n", "df" ] }, { "cell_type": "markdown", "id": "4a21ca3b", "metadata": {}, "source": [ "To use ```CSVDataSource``` in a backtest, we create a new [Strategy](https://www.pybroker.com/en/latest/reference/pybroker.strategy.html#pybroker.strategy.Strategy) object and pass the custom ```DataSource```:" ] }, { "cell_type": "code", "execution_count": 3, "id": "e1238ecd", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Backtesting: 2021-06-01 00:00:00 to 2021-12-01 00:00:00\n", "\n", "Loading bar data...\n", "Loaded bar data: 0:00:00 \n", "\n", "Test split: 2021-06-01 00:00:00 to 2021-12-01 00:00:00\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100% (129 of 129) |######################| Elapsed Time: 0:00:00 
Time: 0:00:00\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Finished backtest: 0:00:02\n" ] }, { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>type</th><th>symbol</th><th>date</th><th>shares</th><th>limit_price</th><th>fill_price</th><th>fees</th></tr>\n",
"    <tr><th>id</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>buy</td><td>NKE</td><td>2021-09-21</td><td>100</td><td>NaN</td><td>154.86</td><td>0.0</td></tr>\n",
"    <tr><th>2</th><td>sell</td><td>NKE</td><td>2021-11-04</td><td>100</td><td>NaN</td><td>173.82</td><td>0.0</td></tr>\n",
"    <tr><th>3</th><td>buy</td><td>DIS</td><td>2021-11-16</td><td>100</td><td>NaN</td><td>159.40</td><td>0.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " type symbol date shares limit_price fill_price fees\n", "id \n", "1 buy NKE 2021-09-21 100 NaN 154.86 0.0\n", "2 sell NKE 2021-11-04 100 NaN 173.82 0.0\n", "3 buy DIS 2021-11-16 100 NaN 159.40 0.0" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from pybroker import Strategy\n", "\n", "def buy_low_sell_high_rsi(ctx):\n", "    pos = ctx.long_pos()\n", "    # Buy 100 shares when flat and RSI is below 30.\n", "    if not pos and ctx.rsi[-1] < 30:\n", "        ctx.buy_shares = 100\n", "    # Sell the whole position when RSI rises above 70.\n", "    elif pos and ctx.rsi[-1] > 70:\n", "        ctx.sell_shares = pos.shares\n", "\n", "strategy = Strategy(csv_data_source, '6/1/2021', '12/1/2021')\n", "strategy.add_execution(buy_low_sell_high_rsi, ['MCD', 'NKE', 'DIS'])\n", "result = strategy.backtest()\n", "result.orders" ] },
{ "cell_type": "markdown", "id": "d3c94d73", "metadata": {}, "source": [ "Because we registered the custom ```rsi``` column with **PyBroker**, it can be accessed on the [ExecContext](https://www.pybroker.com/en/latest/reference/pybroker.context.html#pybroker.context.ExecContext) as ```ctx.rsi```."
] }, { "cell_type": "markdown", "id": "e0eb8e5d", "metadata": {}, "source": [ "## Using a Pandas DataFrame\n", "\n", "If you do not need the flexibility of implementing your own [DataSource](https://www.pybroker.com/en/latest/reference/pybroker.data.html#pybroker.data.DataSource), then you can pass a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) to a ``Strategy`` instead.\n", "\n", "To demonstrate, the earlier example can be re-implemented as follows:" ] }, { "cell_type": "code", "execution_count": 4, "id": "eecb37d6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Backtesting: 2021-06-01 00:00:00 to 2021-12-01 00:00:00\n", "\n", "Test split: 2021-06-01 00:00:00 to 2021-12-01 00:00:00\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100% (129 of 129) |######################| Elapsed Time: 0:00:00 Time: 0:00:00\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Finished backtest: 0:00:00\n" ] }, { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>type</th><th>symbol</th><th>date</th><th>shares</th><th>limit_price</th><th>fill_price</th><th>fees</th></tr>\n",
"    <tr><th>id</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>buy</td><td>NKE</td><td>2021-09-21</td><td>100</td><td>NaN</td><td>154.86</td><td>0.0</td></tr>\n",
"    <tr><th>2</th><td>sell</td><td>NKE</td><td>2021-11-04</td><td>100</td><td>NaN</td><td>173.82</td><td>0.0</td></tr>\n",
"    <tr><th>3</th><td>buy</td><td>DIS</td><td>2021-11-16</td><td>100</td><td>NaN</td><td>159.40</td><td>0.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " type symbol date shares limit_price fill_price fees\n", "id \n", "1 buy NKE 2021-09-21 100 NaN 154.86 0.0\n", "2 sell NKE 2021-11-04 100 NaN 173.82 0.0\n", "3 buy DIS 2021-11-16 100 NaN 159.40 0.0" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv('data/prices.csv')\n", "df['date'] = pd.to_datetime(df['date'])\n", "\n", "pybroker.register_columns('rsi')\n", "\n", "strategy = Strategy(df, '6/1/2021', '12/1/2021')\n", "strategy.add_execution(buy_low_sell_high_rsi, ['MCD', 'NKE', 'DIS'])\n", "result = strategy.backtest()\n", "result.orders" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" } }, "nbformat": 4, "nbformat_minor": 5 }